ON FALSE DISCOVERY CONTROL UNDER DEPENDENCE

By Wei Biao Wu 

University of Chicago 

A popular framework for false discovery control is the random 
effects model in which the null hypotheses are assumed to be inde- 
pendent. This paper generalizes the random effects model to a con- 
ditional dependence model which allows dependence between null 
hypotheses. The dependence can be useful to characterize the spatial 
structure of the null hypotheses. Asymptotic properties of false dis- 
covery proportions and numbers of rejected hypotheses are explored 
and a large-sample distributional theory is obtained. 

1. Introduction. Since the seminal work of Benjamini and Hochberg
(BH) [2], the paradigm of false discovery control has been widely used in
multiple hypothesis testing problems and it is often more useful than the
classical Bonferroni-type method. Suppose that we want to test $n$ hypotheses
$H_i$, $1 \le i \le n$. Write $H_i = 0$ if the $i$th null hypothesis is true and
$H_i = 1$ otherwise. Let $V$ be the number of erroneously rejected null hypotheses,
that is, rejected hypotheses which are actually true, and let $R$ be the total
number of rejected hypotheses. The false discovery proportion (FDP) is defined as

(1)  $$\mathrm{FDP} = \frac{V}{R \vee 1}, \qquad \text{where } a \vee b = \max(a, b),$$

and the false discovery rate (FDR) is defined as the expected value $\mathbb{E}(\mathrm{FDP})$.

We now briefly describe the BH procedure. Let $X_i$ be the marginal p-value
of the $i$th test, $1 \le i \le n$, and let $X_{(1)} \le \cdots \le X_{(n)}$ be the order
statistics of $X_1, \ldots, X_n$. Given a control level $\alpha \in (0, 1)$, let

(2)  $$R = \max\{i \in \{0, 1, \ldots, n+1\} : X_{(i)} \le \alpha i/n\},$$

where $X_{(0)} = 0$ and $X_{(n+1)} = 1$. The BH procedure rejects all hypotheses for
which $X_i \le X_{(R)}$. If $R = 0$, then all hypotheses are accepted. Assume that
$X_i$, $1 \le i \le n$, are independent and the p-value distribution is continuous;
BH [2] proved that, if there are $N_0$ true null hypotheses, then
$\mathbb{E}[V/(R \vee 1)] = \alpha N_0/n$. A popular framework for false discovery control is the
random effects model or the two-component mixture model (McLachlan and
Peel [12]) in which the null hypotheses $H_i$, $1 \le i \le n$, are assumed to be
independent Bernoulli random variables. In particular, one assumes that
$(X_i, H_i)$ are independent and identically distributed (i.i.d.) with

(3)  $$\mathbb{P}(X_i \le x \mid H_i = 0) = x, \qquad \mathbb{P}(X_i \le x \mid H_i = 1) = G(x), \qquad 0 \le x \le 1,$$

and $H_i \sim \mathrm{Bernoulli}(\pi_1)$ [viz., $\mathbb{P}(H_i = 1) = \pi_1$ and $\pi_0 = 1 - \pi_1 = \mathbb{P}(H_i = 0)$].
Here $G$ is the distribution function of the p-value $X_i$ under the alternative
hypothesis. It is commonly assumed that $X_i \sim \mathrm{uniform}(0, 1)$ if $H_i = 0$.
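
As an illustration, the BH rule (2) and the FDP in (1) can be sketched in a few lines of Python. This is a minimal sketch rather than code from the paper: the function names are hypothetical, NumPy is assumed, and the search for $R$ runs over $i \in \{0, \ldots, n\}$, which is the usual convention.

```python
import numpy as np

def benjamini_hochberg(pvalues, alpha):
    """Return (R, reject) for the BH step-up rule at level alpha.

    R is the largest i with X_(i) <= alpha * i / n (0 if there is none);
    reject marks the hypotheses with the R smallest p-values.
    """
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    sorted_p = p[order]
    below = sorted_p <= alpha * np.arange(1, n + 1) / n
    R = int(np.nonzero(below)[0].max()) + 1 if below.any() else 0
    reject = np.zeros(n, dtype=bool)
    reject[order[:R]] = True
    return R, reject

def false_discovery_proportion(reject, H):
    """FDP = V / max(R, 1), where H[i] = 0 marks a true null hypothesis."""
    reject = np.asarray(reject, dtype=bool)
    H = np.asarray(H)
    V = int(np.sum(reject & (H == 0)))  # rejected although the null is true
    R = int(np.sum(reject))
    return V / max(R, 1)
```

For instance, with p-values (0.001, 0.02, 0.03, 0.5, 0.9) and $\alpha = 0.1$ the thresholds are 0.02, 0.04, ..., 0.1, so the three smallest p-values are rejected.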

Due to the independence assumption, the classical random effects model 
or the two-component mixture model does not allow one to model spatial or 
location structures of the null hypotheses. In certain applications one expects 
that false null hypotheses occur in clumps, which are spatially clustered. In 
this case it is reasonable to expect that, if Hi = 1, then the nearby hypothe- 
ses Hj, where j is close to i, are more likely to be false. In the negative 
dependence case the occurrence of Hi = 1 prevents nearby hypotheses from 
being false. Recently the multiple testing problem under spatial dependence 
has been considered by Qiu et al. [16] for microarray data and by de Castro 
and Singer [6] for geographical data. 

In this paper we shall consider the problem of false discovery control without
the independence assumption. In particular, we propose the conditional
independence model: Let $(H_i)$ be a 0/1-valued stationary process, and, given
$(H_i)_{i=1}^n$, the $X_i$ are independent. The dependence is imposed on the hypotheses
$(H_i)$. A simple relaxation of the independence assumption on $(H_i)$ is to impose
a Markovian structure. In this case the model is closely related to hidden
Markov models (see Section 3).

As demonstrated in Storey, Taylor and Siegmund [19], Genovese and 
Wasserman [9], Chi [5] and Meinshausen and Rice [13] among others, the 
theory of empirical processes plays a useful role in the study of false dis- 
covery control. Recently Wu [24] considered empirical distribution functions 
for a wide class of stationary processes. In this paper we shall deal with the 
p-values arising from the aforementioned conditional independence model. 
In particular, we shall prove the validity of the BH procedure and present 
a distributional theory for R, the number of rejected hypotheses. We shall 
also establish a Bahadur-type asymptotic expansion for the false discovery 
proportion V/(R V 1) and the weak convergence of false discovery processes 
to Gaussian processes. 

The rest of the paper is structured as follows. Our dependence struc- 
ture and main results are presented in Section 2 and proved in Section 4. 
Applications to Markov models and linear processes are given in Section 3. 

2. Main results. We assume that $(H_s)_{s \in \mathbb{Z}^d}$ is a stationary random field
and, for presentational simplicity, we shall consider testing hypotheses $H_s$
over $d$-dimensional cubes (cf. Condition 1). Results obtained in the paper
can be generalized without essential difficulties to other types of regions.
For a random variable $\xi$ write $\|\xi\| = \{\mathbb{E}(|\xi|^2)\}^{1/2}$. Denote by $\Rightarrow$ the weak
convergence and by $N(\mu, \sigma^2)$ a normal distribution with mean $\mu$ and variance
$\sigma^2$. Let $\mathcal{N}$ denote a standard normal random variable.



Condition 1. Let $(H_s)_{s \in \mathbb{Z}^d}$ be a stationary, 0/1-valued random field.
For $n_1, \ldots, n_d \in \mathbb{N}$ let the $d$-dimensional cube
$C = \{1, 2, \ldots, n_1\} \times \cdots \times \{1, 2, \ldots, n_d\}$ and $n = n_1 n_2 \cdots n_d$.
Write the sum $N_C = \sum_{s \in C} H_s$. Let $\pi_0 = \mathbb{P}(H_s = 0)$ and $\pi_1 = 1 - \pi_0$.
Assume that, as $\min_{k \le d} n_k \to \infty$, $\|N_C - n\pi_1\| = O(\sqrt{n})$,
and the central limit theorem (CLT) $n^{-1/2}(N_C - n\pi_1) \Rightarrow N(0, \sigma^2)$ holds for
some $\sigma^2 < \infty$.



In Section 3 we will present examples in which Condition 1 is satisfied. With
a slight abuse of notation, we write $(H_s)_{s \in C}$ as $(H_i)_{i=1}^n$, where $i = 1, \ldots, n$
corresponds to the lexicographic ordering of $s \in C$.

Under the conditional independence model we can have the representation

(4)  $$X_i = (1 - H_i) U_i + H_i G^{-1}(U_i),$$

where $U_i$ are independent and identically distributed (i.i.d.) uniform(0, 1)
random variables which are also independent of $(H_i)_{i=1}^n$, and
$G^{-1}(u) = \inf\{x \in [0, 1] : G(x) \ge u\}$ is the inverse of $G$. Clearly (4) implies that the
conditional distribution $[X_i \mid H_i = 0]$ is uniform(0, 1) and $[X_i \mid H_i = 1]$ is $G$. If
$(H_i)$ are independent, then (4) reduces to the random effects model. Our
dependence paradigm is different from earlier ones adopted in Farcomeni [7]
and Benjamini and Yekutieli [3].
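
Representation (4) translates directly into a simulation recipe. The sketch below is an illustration under stated assumptions, not the paper's code: the hidden states follow a hypothetical two-state Markov chain with flip probabilities `p01` and `p10`, and $G$ is the alternative distribution whose density $g(x) = a(1+a)^2/(x+a)^2 - a$ with $a = 1/98$ is used in the simulation study of Section 3, for which $G(x) = (1+a)^2 x/(x+a) - ax$ and $G^{-1}$ has a closed form.

```python
import numpy as np

A = 1.0 / 98.0  # parameter a of the alternative density used in Section 3

def G(x, a=A):
    """Alternative distribution function G(x) = (1+a)^2 x/(x+a) - a x."""
    return (1.0 + a) ** 2 * x / (x + a) - a * x

def G_inv(u, a=A):
    """Inverse of G: the smaller root of a x^2 - (1 + 2a - u) x + a u = 0."""
    b = 1.0 + 2.0 * a - u
    return 2.0 * a * u / (b + np.sqrt(b * b - 4.0 * a * a * u))

def markov_hypotheses(n, p01, p10, rng):
    """Stationary two-state chain (an assumed example of dependent H_i)."""
    pi1 = p01 / (p01 + p10)  # stationary probability of H_i = 1
    H = np.empty(n, dtype=int)
    H[0] = int(rng.random() < pi1)
    for i in range(1, n):
        flip = p01 if H[i - 1] == 0 else p10
        H[i] = 1 - H[i - 1] if rng.random() < flip else H[i - 1]
    return H

def simulate_pvalues(H, rng):
    """X_i = (1 - H_i) U_i + H_i G^{-1}(U_i) with i.i.d. uniform U_i."""
    U = rng.random(H.size)
    return np.where(H == 1, G_inv(U), U)
```

Small values of `p01` and `p10` make the false nulls occur in runs, which is the clumping behavior the conditional independence model is meant to capture.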

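Following Genovese and Wasserman [9], we consider the false discovery
process

(5)  $$\Gamma_n(t) = \frac{n \Lambda_n(t)}{n F_n(t) + \prod_{i=1}^n \mathbf{1}_{X_i > t}},$$

where

(6)  $$\Lambda_n(t) = \frac{1}{n} \sum_{i=1}^n (1 - H_i) \mathbf{1}_{X_i \le t} \qquad \text{and} \qquad F_n(t) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}_{X_i \le t}.$$

Then $F_n$ is the empirical process of $X_1, \ldots, X_n$ and $\Lambda_n$ can be interpreted
as a marked empirical process. Let

(7)  $$\Delta_n(t) = \frac{1}{n} \sum_{i=1}^n H_i \mathbf{1}_{X_i \le t} = F_n(t) - \Lambda_n(t).$$

Note that $H_i \mathbf{1}_{X_i \le t} = H_i \mathbf{1}_{U_i \le G(t)}$ and $(1 - H_i) \mathbf{1}_{X_i \le t} = (1 - H_i) \mathbf{1}_{U_i \le t}$. Under
the conditional independence model (4), we have for $0 \le t \le 1$ that

$$\Lambda(t) := \mathbb{E}\Lambda_n(t) = t\pi_0 \qquad \text{and} \qquad \Delta(t) := \mathbb{E}\Delta_n(t) = G(t)\pi_1.$$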

To obtain large-sample properties of the false discovery process $\Gamma_n$, we
need to establish an asymptotic theory for $\Lambda_n(t) - \Lambda(t)$ and $\Delta_n(t) - \Delta(t)$.
Theorem 1 below concerns the weak convergence of $\sqrt{n}[\Lambda_n(t) - \Lambda(t)]$ and
$\sqrt{n}[\Delta_n(t) - \Delta(t)]$ in a functional space. Let $\mathcal{D}[0, 1]$ be the collection of
functions which are right continuous and have left limits; let
$\mathcal{D}^2[0, 1] = \{(f_1, f_2) : f_1, f_2 \in \mathcal{D}[0, 1]\}$. Assume throughout the paper that $G$ has a bounded
density $g = G'$, namely, $\sup_{x \in [0,1]} g(x) < \infty$. Asymptotic results in Theorems
1-4 below are meant as $\min_{k \le d} n_k \to \infty$.

Theorem 1. Assume Condition 1. Then there exist tight centered Gaussian
processes $W_\Lambda(t)$ and $W_\Delta(t)$, $0 \le t \le 1$, such that the weak convergence

(8)  $$\big(\sqrt{n}\{\Lambda_n(t) - \Lambda(t)\}, \sqrt{n}\{\Delta_n(t) - \Delta(t)\}\big) \Rightarrow (W_\Lambda(t), W_\Delta(t))$$

holds in the space $\mathcal{D}^2[0, 1]$.

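Since $F_n$ is a nondecreasing function and $X_{(j)}$ is the $j$th quantile of $F_n$,
the value $R$ defined in (2) satisfies $R = \max\{0 \le j \le n : j/n \le F_n(\alpha j/n)\}$.
Write $F(t) = \pi_0 t + \pi_1 G(t) = \Lambda(t) + \Delta(t)$ for the marginal distribution
function of $X_i$. Let

(9)  $$\nu_{\mathrm{BH}} = \sup\{t \in [0, 1] : t/\alpha \le F_n(t)\} \qquad \text{and} \qquad \nu_0 = \sup\{t \in [0, 1] : t/\alpha \le F(t)\}.$$

It is easily seen that $R \le n\nu_{\mathrm{BH}}/\alpha < R + 1$. Let $f(x) = F'(x)$ and

(10)  $$\alpha_* = \frac{1}{f(0)} = \frac{1}{F'(0)} = \frac{1}{\pi_0 + \pi_1 g(0)}.$$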

If $\pi_1$ and $g(0)$ are large, then $\alpha_*$ is small. Theorem 2 below describes the
asymptotic behavior of $\nu_{\mathrm{BH}}$ and suggests a dichotomous phenomenon. It gives a
Bahadur representation of $\nu_{\mathrm{BH}}$ when $\alpha > \alpha_*$ and $R = O_{\mathbb{P}}(1)$ when $\alpha < \alpha_*$.
At the boundary case $\alpha = \alpha_*$ we have an interesting nonstandard limiting
distribution with a cubic root normalizing constant. In the case of the random
effects model in which $H_i$ are i.i.d., Chi [5] obtained interesting results on
strong convergence properties of $R$ for the two cases $\alpha > \alpha_*$ and $\alpha = \alpha_*$. Chi
also obtained a distributional result for $R$ when $\alpha < \alpha_*$ and argued that the
number of rejected hypotheses is bounded even if there is a positive proportion
of untrue null hypotheses. Chi's work shows the criticality phenomenon
of false discovery rate controlling procedures.
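
The relation between $R$ and $\nu_{\mathrm{BH}}$ in (9) can be checked numerically. The sketch below is illustrative only (the function name is hypothetical and NumPy is assumed). It uses the observation that the line $t/\alpha$ can meet the step function $F_n$ only at the levels $t = \alpha k/n$, so the supremum in (9) is attained at $t = \alpha R/n$, which is consistent with $R \le n\nu_{\mathrm{BH}}/\alpha < R + 1$.

```python
import numpy as np

def nu_bh(pvalues, alpha):
    """Return (R, nu_BH) with nu_BH = sup{t in [0,1] : t/alpha <= F_n(t)}.

    Candidates are t_k = alpha * k / n; t_k is feasible when at least k
    p-values are <= t_k, i.e. when k/n <= F_n(alpha * k / n).
    """
    xs = np.sort(np.asarray(pvalues, dtype=float))
    n = xs.size
    k = np.arange(n + 1)
    t = alpha * k / n
    counts = np.searchsorted(xs, t, side="right")  # n * F_n(t_k)
    feasible = k <= counts                         # k = 0 is always feasible
    R = int(k[feasible].max())
    return R, float(t[R])
```

Comparing integer counts instead of the floating-point values $t/\alpha$ and $F_n(t)$ avoids rounding problems at the points where the two sides are equal.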



Theorem 2. Assume Condition 1.

(i) If $\alpha_*^{-1} > \alpha^{-1} > f(\nu_0)$, then

(11)  $$\nu_{\mathrm{BH}} - \nu_0 = \frac{F_n(\nu_0) - \nu_0/\alpha}{\alpha^{-1} - f(\nu_0)} + O_{\mathbb{P}}(n^{-3/4}).$$

Consequently $\sqrt{n}(\nu_{\mathrm{BH}} - \nu_0) \Rightarrow N(0, \sigma^2)$ for some $\sigma^2 < \infty$.

(ii) If $\alpha < \alpha_*$, then $R = O_{\mathbb{P}}(1)$.

(iii) If $\alpha = \alpha_*$ and $c_0 = -f'(0)/[2\sqrt{f(0)}] > 0$, then

(12)  $$n^{1/3} \nu_{\mathrm{BH}} \Rightarrow [\max(\mathcal{N}/c_0, 0)]^{2/3}.$$

In the classical almost sure Bahadur representation theory for sample
quantiles, one has the error bound $O[n^{-3/4} (\log n)^{1/2} (\log\log n)^{1/4}]$ (see Shorack
and Wellner [17]). We expect that the bound $O_{\mathbb{P}}(n^{-3/4})$ in (11) is optimal
up to a multiplicative logarithmic factor.

Theorem 3(i) gives asymptotic properties of FDP, which is the value of
the false discovery process $\Gamma_n$ at the random time $\nu_{\mathrm{BH}}$, while Theorem 3(ii)
concerns the false nondiscovery proportion (FNP). FNP is the proportion of
accepted null hypotheses which are actually false. Since $G$ is continuous,
the FDR is $\alpha\pi_0$ (BH [2]). As pointed out in Genovese and Wasserman [9], it
is not easy to study FDP since the random time $\nu_{\mathrm{BH}}$ and the false discovery
process $\Gamma_n(\cdot)$ are dependent; recall (9). The relation (13) gives an asymptotic
expansion for $\Gamma_n(\nu_{\mathrm{BH}}) - \alpha\pi_0$ with a good error bound $O_{\mathbb{P}}(n^{-3/4})$ and the
term $\Lambda_n(\nu_0) - \Lambda(\nu_0)$ is easier to work with. It seems that the asymptotic
expansion is new even in the special case of independent null hypotheses.

Theorem 3. Assume Condition 1 and $\alpha_*^{-1} > \alpha^{-1} > f(\nu_0)$.

(i) We have

(13)  $$\Gamma_n(\nu_{\mathrm{BH}}) - \alpha\pi_0 = \frac{\alpha}{\nu_0} [\Lambda_n(\nu_0) - \Lambda(\nu_0)] + O_{\mathbb{P}}(n^{-3/4}).$$

Consequently $\sqrt{n}[\Gamma_n(\nu_{\mathrm{BH}}) - \alpha\pi_0] \Rightarrow N(0, \sigma_0^2)$ for some $\sigma_0 < \infty$.

(ii) Let $X_n^* = \max_{i \le n} X_i$ and define the false nondiscovery process

$$\Xi_n(t) = \frac{\bar{\Delta}_n(t)}{1 - F_n(t) + \mathbf{1}_{X_n^* \le t}/n}, \qquad \text{where } \bar{\Delta}_n(t) = \frac{1}{n} \sum_{i=1}^n H_i \mathbf{1}_{X_i > t}.$$

Let $c = \pi_0(\alpha - 1)/[1 - \alpha f(\nu_0)] + 1 - \nu_0/\alpha$ and $\Xi(t) = \pi_1[1 - G(t)]/[1 - F(t)]$.
Then

(14)  $$\Xi_n(\nu_{\mathrm{BH}}) - \Xi(\nu_0) = \frac{c[F_n(\nu_0) - F(\nu_0)]}{(1 - \nu_0/\alpha)^2} + \frac{\bar{\Delta}_n(\nu_0) - \mathbb{E}\bar{\Delta}_n(\nu_0)}{1 - \nu_0/\alpha} + O_{\mathbb{P}}(n^{-3/4}),$$

and consequently $\sqrt{n}[\Xi_n(\nu_{\mathrm{BH}}) - \Xi(\nu_0)] \Rightarrow N(0, \sigma_1^2)$ for some $\sigma_1^2 < \infty$.

We shall now discuss the estimation of the proportion of false null
hypotheses $\pi_1$ and $g$ under dependence. When the $H_i$'s are independent,
Genovese and Wasserman [9] pointed out that there is an unidentifiability
issue in estimating $\pi_1$ and $g$ from the p-values $X_1, \ldots, X_n$. To see this, let
$\lambda \in (1 - \min_{0 \le x \le 1} g(x), 1/\pi_1)$, $\pi_1^* = \lambda\pi_1$ and $g^*(x) = (g(x) - 1)/\lambda + 1$. Then
we have the identity $(1 - \pi_1) + \pi_1 g(x) = (1 - \pi_1^*) + \pi_1^* g^*(x)$, suggesting that
$X_i$ can also be viewed as a simple random sample from a mixture model
with the two components uniform[0, 1] and $g^*$. To ensure identifiability, we
assume $g(1) = 0$. Since $f(t) = \pi_0 + \pi_1 g(t)$, as in Storey [18], we estimate
$\pi_0 = 1 - \pi_1$ by

(15)  $$\hat{\pi}_0 = \frac{1 - F_n(1 - b)}{b}, \qquad \text{where } 0 < b < 1.$$

If $f$ is differentiable at 1, then in the sense of mean squared error the optimal
bandwidth is $b = b_n \asymp n^{-1/3}$ (cf. Lemma 2). Let $\hat{\pi}_1 = 1 - \hat{\pi}_0$ be the estimator
of $\pi_1 = \mathbb{P}(H_i = 1)$. The BH procedure can be improved by the plug-in
procedure: let

$$\nu_{\mathrm{PI}} = \sup\{t \in [0, 1] : t\hat{\pi}_0/\alpha \le F_n(t)\}$$

and reject hypotheses for which $X_i \le X_{(R_{\mathrm{PI}})}$, where $R_{\mathrm{PI}} = \lfloor n\nu_{\mathrm{PI}}\hat{\pi}_0/\alpha \rfloor$. We
argue that in the case of dependent null hypotheses, the plug-in procedure
also improves the BH procedure by increasing power while it still controls
the false discovery rate. Let

$$\nu_* = \sup\{t \in [0, 1] : t\pi_0/\alpha \le F(t)\}.$$
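
The estimator (15) and the plug-in rule can be sketched as follows. This is a hedged illustration, not the paper's implementation: the function names are hypothetical, NumPy is assumed, and the plug-in rule is coded as the BH step-up rule run at the adjusted level $\alpha/\hat{\pi}_0$, which agrees with $R_{\mathrm{PI}}$ when $\alpha/\hat{\pi}_0 < 1$. Capping the adjusted level at 1 (which also covers $\hat{\pi}_0 = 0$) is an assumption made here for the edge cases.

```python
import numpy as np

def storey_pi0(pvalues, b):
    """Estimator (15): pi0_hat = (1 - F_n(1 - b)) / b for 0 < b < 1."""
    p = np.asarray(pvalues, dtype=float)
    return float(np.mean(p > 1.0 - b)) / b

def plug_in_rejections(pvalues, alpha, b):
    """Number of rejections of the plug-in rule (BH at level alpha / pi0_hat)."""
    p = np.sort(np.asarray(pvalues, dtype=float))
    n = p.size
    pi0_hat = storey_pi0(p, b)
    # Assumption: cap the adjusted level at 1; this also handles pi0_hat = 0.
    level = 1.0 if pi0_hat <= alpha else alpha / pi0_hat
    below = p <= level * np.arange(1, n + 1) / n
    return int(np.nonzero(below)[0].max()) + 1 if below.any() else 0
```

Because $\hat{\pi}_0$ is typically below 1 when false nulls are present, the adjusted level exceeds $\alpha$, so the plug-in rule rejects at least as many hypotheses as BH in that case.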

Theorem 4. Assume Condition 1, $g(1) = 0$ and $\alpha^{-1} > f(\nu_0)$. Further
assume $\alpha/\pi_0 > \alpha_*$ and $b_n \asymp n^{-1/3}$. Then we have (i)

(16)  $$\nu_{\mathrm{PI}} - \nu_* = \frac{\nu_*(\pi_0 - \hat{\pi}_0)}{\pi_0 - \alpha f(\nu_*)} + \frac{F_n(\nu_*) - F(\nu_*)}{\pi_0/\alpha - f(\nu_*)} + O_{\mathbb{P}}(n^{-2/3})$$

and (ii) $\Gamma_n(\nu_{\mathrm{PI}}) - \alpha = \alpha(1 - \hat{\pi}_0/\pi_0) + O_{\mathbb{P}}(n^{-1/2})$.

3. Examples and simulation studies. Section 3.1 concerns one-dimensional
processes and Section 3.2 contains an application to Ising models in $\mathbb{Z}^2$. In
both cases we shall show that Condition 1 is satisfied.

3.1. One-dimensional processes. Assume that $(H_i)$ is a stationary process
of the form

(17)  $$H_i = h(\ldots, \eta_{i-1}, \eta_i, \eta_{i+1}, \ldots),$$

where $\eta_i$ are i.i.d. random variables or innovations and $h$ is a measurable
function. By allowing the dependence of $H_i$ on $\eta_j$, we are incorporating
location information in modeling the dependence among null hypotheses.
As a simple case, if $h$ in (17) is a function of $m$ ($m \in \mathbb{N}$) arguments,
$H_i = h(\eta_{i-m+1}, \ldots, \eta_i)$, then $H_i$ is $m$-dependent. Our formulation (17) seems in
line with the principle that "everything is related to everything else, but
near things are more related than distant things" (Tobler [21]).

We now give a simple condition for the CLT $n^{-1/2}(N_n - n\pi_1) \Rightarrow N(0, \sigma^2)$,
where $N_n = \sum_{i=1}^n H_i$. Let $\mathcal{F}_i = (\ldots, \eta_{i-1}, \eta_i)$ and define the projection
operator $\mathcal{P}_k$ by $\mathcal{P}_k \xi = \mathbb{E}(\xi \mid \mathcal{F}_k) - \mathbb{E}(\xi \mid \mathcal{F}_{k-1})$ if the latter exists. Assume that

(18)  $$c_0 := \sum_{i=-\infty}^{\infty} \delta_i < \infty, \qquad \text{where } \delta_i = \|\mathcal{P}_0 H_i\|.$$

Then $\|N_n - n\pi_1\| \le c_0\sqrt{n}$ and the CLT holds (cf. Lemma 1). The quantity $\delta_i$
is related to the predictive dependence measure given in Wu [23]. Condition
(18) indicates that the cumulative impact of $\eta_0$ in predicting the whole
sequence $(H_i)_{i \in \mathbb{Z}}$ is finite. In this sense (18) is a short-range dependence
condition (Wu [23]). If (18) is violated, then one enters the territory of
long-range dependence and one may have a non-Gaussian limit.

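We now verify (18) for truncation indicators of linear processes. Let
$H_i = \mathbf{1}_{Z_i \le z_*}$, where $z_* \in \mathbb{R}$ is fixed and $Z_k = \sum_{i=-\infty}^{\infty} a_i \eta_{k-i}$. Here $\eta_i$ are i.i.d.
random variables and $(a_i)_{i \in \mathbb{Z}}$ are real coefficients. Let $f_\eta$ be the density of
$\eta_i$ and $a_0 = 1$. Assume $\mathbb{E}(|\eta_i|^d) < \infty$, $d > 0$, and $c_* = \sup_z |f_\eta(z)| < \infty$. Let
$d' = \min(1, d)$. Then $\delta_i = O(|a_i|^{d'/2})$ and (18) holds if $\sum_{i \in \mathbb{Z}} |a_i|^{d'/2} < \infty$. To
see this, for $i \ne 0$ let $Y_i = Z_i - a_i\eta_0$. Since $Y_i - \eta_i$ and $\eta_i$ are independent,
the density $f_{Y_i}$ of $Y_i$ satisfies $f_{Y_i}(y) = \mathbb{E} f_\eta[y - (Y_i - \eta_i)] \le c_*$. Let $F_{Y_i}$ be the
distribution function of $Y_i$. Then for $i \ne 0$,

(19)  $$\begin{aligned}
\mathbb{E}|\mathbf{1}_{Z_i \le z_*} - \mathbf{1}_{Y_i \le z_*}| &\le \mathbb{P}(|Y_i - z_*| \le |a_i\eta_0|) \\
&= \mathbb{E}[F_{Y_i}(z_* + |a_i\eta_0|) - F_{Y_i}(z_* - |a_i\eta_0|)] \\
&\le \mathbb{E}\{\min(1, 2c_*|a_i\eta_0|)\} \\
&\le \mathbb{E}\{(2c_*|a_i\eta_0|)^{d'}\} = O(|a_i|^{d'}).
\end{aligned}$$

Let $\eta_0', \eta_i$, $i \in \mathbb{Z}$, be i.i.d. and $Z_i' = Y_i + a_i\eta_0'$. Then (19) implies
$\mathbb{E}|\mathbf{1}_{Z_i \le z_*} - \mathbf{1}_{Z_i' \le z_*}| = O(|a_i|^{d'})$. Observe that
$\mathbb{E}(\mathbf{1}_{Z_i \le z_*} - \mathbf{1}_{Z_i' \le z_*} \mid \mathcal{F}_0) = \mathcal{P}_0 H_i$. By Jensen's
inequality, $\delta_i = O(|a_i|^{d'/2})$.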
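
For intuition, truncation indicators of a linear process are easy to simulate. The sketch below is an illustration under assumptions that are not in the paper: the function name is hypothetical, the filter is one-sided with finitely many coefficients $a_0, \ldots, a_q$ (so that (18) holds trivially), and the innovations are standard Gaussian.

```python
import numpy as np

def truncation_indicators(n, coeffs, z_star, rng):
    """Simulate H_i = 1{Z_i <= z_star}, Z_k = sum_{i=0}^q a_i * eta_{k-i}.

    coeffs holds a_0, ..., a_q (finite, one-sided filter: an assumption);
    the innovations eta are standard normal (also an assumption).
    """
    a = np.asarray(coeffs, dtype=float)
    q = a.size - 1
    eta = rng.standard_normal(n + q)
    Z = np.convolve(eta, a, mode="valid")  # length n moving average
    return (Z <= z_star).astype(int)
```

With positive coefficients neighboring $Z_i$ are positively correlated, so the indicators $H_i$ tend to form runs; coefficients of alternating sign produce the opposite effect.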

3.2. Ising models. Markov random fields have been widely used in image
analysis and spatial statistics. Here we shall consider a false discovery control
paradigm with the null hypotheses $(H_s)$ satisfying the Gibbs distribution in
$\mathbb{Z}^2$, so that $(H_s)$ are spatially dependent. Let $L_s = 2H_s - 1$. Then $L_s = -1$
(resp. 1) implies that the null hypothesis $H_s$ is true (resp. false). That $L_s = 1$
may imply that a neuron is excited or a plant is infected. Here we consider
the simplest Ising model. For a site $s = (j, k) \in \mathbb{Z}^2$, let
$\mathcal{N}_s = \{(j', k') \in \mathbb{Z}^2 : |j - j'| + |k - k'| = 1\}$ be the neighborhood of $s$ and write
$\mathbb{Z}^2 \setminus s = \{t \in \mathbb{Z}^2 : t \ne s\}$. For a set $A \subset \mathbb{Z}^2$ write $L_A = (L_a, a \in A)$ and
$l_A = (l_a, a \in A)$, where $l_a \in \{-1, 1\}$, $a \in \mathbb{Z}^2$. Assume that we have the Markovian structure

(20)  $$\mathbb{P}[L_s = l_s \mid L_{\mathbb{Z}^2 \setminus s} = l_{\mathbb{Z}^2 \setminus s}] = \mathbb{P}[L_s = l_s \mid L_{\mathcal{N}_s} = l_{\mathcal{N}_s}] = \frac{\exp(\beta l_s \sum_{t \in \mathcal{N}_s} l_t)}{\exp(\beta \sum_{t \in \mathcal{N}_s} l_t) + \exp(-\beta \sum_{t \in \mathcal{N}_s} l_t)},$$

where $\beta$ characterizes the interaction between pairs of nearest-neighbor spins
and it is a function of Boltzmann's constant and the temperature. That
$\beta > 0$ (resp. $\beta < 0$) corresponds to ferromagnetic (resp. antiferromagnetic)
interaction. The former is an attractive feature in dealing with situations
in which one expects that false null hypotheses occur in clumps or clusters.
In the antiferromagnetic case, one has negative dependence which prevents
false null hypotheses from occurring in clumps.
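
The conditional law (20) is all that a single-site Gibbs sampler needs. The following sketch is an assumption-laden illustration rather than the sampler used for the paper's simulations: the function names are hypothetical, sites are updated in uniformly random order (one reading of "random sweeps"), and the lattice is a square torus, that is, periodic boundary conditions.

```python
import numpy as np

def cond_prob_plus(neighbor_sum, beta):
    """P[L_s = +1 | neighbors] from (20): e^{bS} / (e^{bS} + e^{-bS})."""
    return 1.0 / (1.0 + np.exp(-2.0 * beta * neighbor_sum))

def gibbs_random_sweeps(size, beta, n_iter, rng):
    """Run n_iter single-site Gibbs updates on a size x size torus.

    Returns the 0/1 field H_s = (L_s + 1) / 2.
    """
    L = rng.choice([-1, 1], size=(size, size))
    for _ in range(n_iter):
        j, k = rng.integers(size, size=2)  # site chosen uniformly at random
        s = (L[(j - 1) % size, k] + L[(j + 1) % size, k]
             + L[j, (k - 1) % size] + L[j, (k + 1) % size])
        L[j, k] = 1 if rng.random() < cond_prob_plus(s, beta) else -1
    return (L + 1) // 2
```

For $\beta = 0$ the update probability is always 1/2, which recovers independent hypotheses; for $\beta$ near the critical value many more iterations are needed before the field is close to stationarity.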

With (20), the distribution of $H_s$ only depends on the values of $H$ at the
four neighbors of $s$. For more details see Winkler [22]. Let
$\beta_* = 2^{-1}\log(1 + \sqrt{2}) = 0.4406868\ldots$ be the critical value. If $0 \le \beta < \beta_*$, then $\mathbb{E}(L_s) = 0$, and
we can apply the central limit theorem in Newman [14] or Baker and Krinsky
[1]: the covariance $\mathrm{cov}(L_0, L_s)$ decays to zero exponentially quickly as
$|s| \to \infty$ and $n^{-1/2}(N_n - n\pi_1) \Rightarrow N(0, \sigma^2)$. So Condition 1 is satisfied and
Theorems 1-4 are applicable.

Consider the situation that $(H_s)$ are not directly observable and we want
to test whether $H_s = 0$ or $H_s = 1$. We conduct pixel-wise multiple hypothesis
tests. Assume that for each site $s$, under $H_s = 0$, the p-value $X_s$ has a
uniform(0, 1) distribution while $[X_s \mid H_s = 1] \sim G$. Since the underlying $(H_i)$ is
not observed and one only knows the p-values $X_i$ which are calculated from test
statistics, we are thus dealing with hidden Markov models by viewing $(H_i)$
as hidden states. Analysis of the p-value sequence $(X_i)$ is useful in
understanding the dependence structure of $(H_i)$ and provides spatial information
on false null hypotheses.

In our simulation we choose the lattice set $\{1, 2, \ldots, 50\}^2$ with periodic
boundary conditions and choose seven levels of $\beta$: $\beta = -0.3$, 0, 0.1, 0.2, 0.3,
0.4 and 0.44. Note that $\beta = 0$ implies independent null hypotheses and larger
$\beta$ indicates stronger dependence. The density of the alternative distribution
is $g(x) = a(1 + a)^2/(x + a)^2 - a$, $x \in (0, 1)$, where $a = 1/98$. Then $g(1) = 0$,
$g(0) = 100$ and the quantity $\alpha_*$ in (10) is 2/101.
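
The stated values can be verified with exact rational arithmetic. The short check below is a sketch: the variable names are hypothetical, and it uses $\pi_0 = \pi_1 = 1/2$, the value reported for this simulation.

```python
from fractions import Fraction

a = Fraction(1, 98)

def g(x):
    """Alternative density g(x) = a (1 + a)^2 / (x + a)^2 - a."""
    return a * (1 + a) ** 2 / (x + a) ** 2 - a

pi0 = Fraction(1, 2)                 # proportion of true nulls in the simulation
pi1 = 1 - pi0
g0, g1 = g(Fraction(0)), g(Fraction(1))
alpha_star = 1 / (pi0 + pi1 * g0)    # critical level alpha_* from (10)
```

Here `g0` equals 100, `g1` equals 0 and `alpha_star` equals 2/101, roughly 0.0198, so the level $\alpha = 0.1$ used in the simulation lies above the critical value and falls under case (i) of Theorem 2.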

Our simulation study shows that, if the dependence is weaker,
then $\Gamma_n(\nu_{\mathrm{BH}})$ is more concentrated on $\alpha\pi_0$ and the approximation (13) in
Theorem 3 is better. We apply the Gibbs sampler with random sweeps
(Greenwood, McKeague and Wefelmeyer [10]) and the number of iterations
is $1.25 \times 10^6$. Choose the level $\alpha = 0.1$. For $\beta < \beta_*$ we have $\pi_0 = 1/2$ and
$\alpha\pi_0 = 0.05$. Write $\delta_1 = \Gamma_n(\nu_{\mathrm{BH}}) - \alpha\pi_0$ and $\delta_2 = \nu_0^{-1}\alpha[\Lambda_n(\nu_0) - \Lambda(\nu_0)]$.
Table 1 shows the estimated $\mathbb{E}(\delta_1^2)$ and $\mathbb{E}(|\delta_1 - \delta_2|^2)$ based on 100 repetitions.
It suggests that $\delta_2$ approximates $\delta_1$ reasonably well. As the dependence gets
stronger, $\mathbb{E}(\delta_1^2)$ becomes larger and the false discovery proportion $\Gamma_n(\nu_{\mathrm{BH}})$
is less concentrated on $\alpha\pi_0$.

Genovese, Lazar and Nichols [8] showed that the false discovery rate
controlling procedure can be useful in the analysis of image data. Figure 1 shows
image restoration based on the p-values under the conditional independence
model. In our simulation we applied pixel-wise multiple hypothesis tests
with the FDR-controlling procedure at level $\alpha = 0.1$. The first row shows
the simulated Ising images for $\beta = 0.3$ and 0.44, respectively. The second
row shows the estimated images and the third row gives the differences. The
red (resp. blue) dots are false positives (resp. negatives). With larger $\alpha$ (say
$\alpha = 0.15$), the number of false negatives is reduced (the simulation is not
reported in the paper).

Figure 1 suggests that, if the dependence is strong (e.g., $\beta = 0.44$) and
the false null hypotheses are clustered, then it is possible to improve the
restored images by incorporating the spatial dependence structure. Pacifico
et al. [15] applied FDR-thresholding to construct conservative confidence
envelopes for Gaussian random fields.

4. Proofs. This section provides proofs of results stated in Section 2. For
readability we list necessary notation here. Recall (6) and (7) for $\Lambda_n(t)$, $F_n(t)$
and $\Delta_n(t)$. Let $N_n = \sum_{i=1}^n H_i$ be the total number of false null hypotheses,

(21)  $$\Lambda_n^*(t) = \frac{1}{n} \sum_{i=1}^n (1 - H_i) t = t(1 - N_n/n) \qquad \text{and} \qquad \Delta_n^*(t) = G(t) N_n/n.$$

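Table 1
The estimated $\mathbb{E}(\delta_1^2)$ and $\mathbb{E}(|\delta_1 - \delta_2|^2)$ based on 100 repetitions

  $\beta$    $\mathbb{E}(\delta_1^2)$       $\mathbb{E}(|\delta_1 - \delta_2|^2)$
  -0.3     $4.2 \times 10^{-5}$     $6.2 \times 10^{-7}$
  0        $4.4 \times 10^{-5}$     $1.1 \times 10^{-6}$
  0.1      $5.6 \times 10^{-5}$     $1.7 \times 10^{-6}$
  0.2      $5.7 \times 10^{-5}$     $1.5 \times 10^{-6}$
  0.3      $6.0 \times 10^{-5}$     $3.4 \times 10^{-6}$
  0.4      $9.5 \times 10^{-5}$     $7.6 \times 10^{-6}$
  0.44     $7.1 \times 10^{-4}$     $1.1 \times 10^{-4}$

Here $\delta_1 = \Gamma_n(\nu_{\mathrm{BH}}) - \alpha\pi_0$ and $\delta_2 = \nu_0^{-1}\alpha[\Lambda_n(\nu_0) - \Lambda(\nu_0)]$.
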
Fig. 1. Row 1: simulated Ising models for $\beta = 0.3$ and 0.44, respectively. Row 2: restored
images based on the p-values under the conditional independence model. Here we applied
pixel-wise multiple hypothesis tests with the FDR-controlling procedure at level $\alpha = 0.1$.
Row 3: the differences between the restored images and the original ones. Dots in red (resp.
blue) are false positives (resp. negatives).

Write $F_n^* = \Lambda_n^* + \Delta_n^*$. Define

(22)  $$\Omega_n(t) = \frac{1}{n} \sum_{i=1}^n (1 - H_i)(\mathbf{1}_{U_i \le t} - t),$$

(23)  $$\Sigma_n(t) = \frac{1}{n} \sum_{i=1}^n H_i[\mathbf{1}_{U_i \le G(t)} - G(t)].$$

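Lemma 1. Assume (18). Then $\|N_n - n\pi_1\| \le c_0\sqrt{n}$ and
$n^{-1/2}(N_n - n\pi_1) \Rightarrow N(0, \sigma^2)$.

Proof. By stationarity, $\|\mathcal{P}_k N_n\| \le \sum_{i=1}^n \delta_{i-k} \le c_0$. Since $\mathcal{P}_k$ are
orthogonal, we have

$$\|N_n - n\pi_1\|^2 = \sum_{k \in \mathbb{Z}} \|\mathcal{P}_k N_n\|^2 \le c_0 \sum_{k \in \mathbb{Z}} \sum_{i=1}^n \delta_{i-k} = n c_0^2.$$

A similar version of the CLT is given in Hannan [11] and the argument
therein is applicable here. Let $D_k = \sum_{i \in \mathbb{Z}} \mathcal{P}_k H_i$ and $M_n = \sum_{k=1}^n D_k$. Then
$D_k$ are stationary martingale differences. Let $u_j = \sum_{i=j}^{\infty} \delta_i$ and $l_j = \sum_{i=-\infty}^{j} \delta_i$.
Since $\mathcal{P}_k$, $k \in \mathbb{Z}$, are orthogonal,

$$\|N_n - n\pi_1 - M_n\|^2 = \sum_{k \in \mathbb{Z}} \|\mathcal{P}_k(N_n - M_n)\|^2.$$

If $k \le 0$, then $\|\mathcal{P}_k(N_n - M_n)\| = \|\mathcal{P}_k N_n\| \le \sum_{i=1}^n \delta_{i-k}$. So
$\sum_{k=-\infty}^{0} \|\mathcal{P}_k N_n\|^2 \le c_0 \sum_{i=1}^n u_i = o(n)$ since $u_m \to 0$ as $m \to \infty$. Similarly,
$\sum_{k=n+1}^{\infty} \|\mathcal{P}_k N_n\|^2 = o(n)$. For $1 \le k \le n$, since $\mathcal{P}_k M_n = D_k$,
$\|\mathcal{P}_k(N_n - M_n)\| \le u_{n+1-k} + l_{-k}$. So we also have
$\sum_{k=1}^n \|\mathcal{P}_k(N_n - M_n)\|^2 = o(n)$. Thus $\|N_n - n\pi_1 - M_n\|^2 = o(n)$.
By the martingale CLT, $M_n/\sqrt{n} \Rightarrow N(0, \sigma^2)$ with $\sigma = \|D_k\|$. So the lemma
holds. □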

Lemma 2. Assume $\sup_{x \in [0,1]} g(x) < \infty$. Let $b_n$ be a sequence of
bandwidths satisfying

(24)  $$b_n \to 0 \qquad \text{and} \qquad n b_n \to \infty.$$

Then under Condition 1, we have

(25)  $$\sqrt{n/b_n}\,[F_n(b_n) - F(b_n)] \Rightarrow N(0, f(0))$$

and

(26)  $$\sqrt{n/b_n}\,[F_n(1 - b_n) - F(1 - b_n)] \Rightarrow N(0, f(1)).$$

Proof. Denote by $\sqrt{-1}$ the imaginary unit. Let

$$D_i = (1 - H_i)(\mathbf{1}_{U_i \le b_n} - b_n) + H_i(\mathbf{1}_{U_i \le G(b_n)} - G(b_n))$$

and $Q_n = \sum_{i=1}^n D_i$. Let $t \in \mathbb{R}$ be a fixed number and $t_n = t/\sqrt{n b_n}$. Under the
conditional independence model (4), the conditional characteristic function is

$$\begin{aligned}
\phi_n(t) &:= \mathbb{E}[\exp(\sqrt{-1}\, t_n Q_n) \mid H_i, 1 \le i \le n] \\
&= [G(b_n) \exp(\sqrt{-1}\, t_n(1 - G(b_n))) + (1 - G(b_n)) \exp(-\sqrt{-1}\, t_n G(b_n))]^{N_n} \\
&\quad \times [b_n \exp(\sqrt{-1}\, t_n(1 - b_n)) + (1 - b_n) \exp(-\sqrt{-1}\, t_n b_n)]^{n - N_n}.
\end{aligned}$$

By Condition 1, $N_n/n \to \pi_1$ in probability. Using Taylor's expansions
$\exp(\delta) = 1 + \delta + \delta^2/2 + O(\delta^3)$ and $G(\delta) = \delta g(0) + o(\delta)$, after elementary calculations we
have

$$\phi_n(t) \to \exp\{-t^2[\pi_0 + \pi_1 g(0)]/2\} = \exp\{-t^2 f(0)/2\} \qquad \text{in probability.}$$

By the Lebesgue dominated convergence theorem, $\mathbb{E}[\phi_n(t)] \to \exp\{-t^2 f(0)/2\}$
since $|\phi_n(t)| \le 1$. So $Q_n/\sqrt{n b_n} \Rightarrow N(0, f(0))$. By Condition 1, since
$G(b_n) = O(b_n)$, we have

$$|n[F_n(b_n) - F(b_n)] - Q_n| \le [b_n + G(b_n)]\,|N_n - n\pi_1| = O_{\mathbb{P}}(b_n\sqrt{n}).$$

So (25) follows since $b_n \to 0$. The other assertion (26) can be similarly proved
by considering $D_i' = (1 - H_i)(\mathbf{1}_{U_i > 1 - b_n} - b_n) + H_i(\mathbf{1}_{U_i > 1 - G(b_n)} - G(b_n))$. □

Lemma 3. Let $b_n$ be a sequence of positive numbers satisfying $b_n \in (0, 1)$
and $n b_n \to \infty$. Assume that $\sup_{x \in [0,1]} g(x) < \infty$. Then we have

(27)  $$\sup_{|u| \le b_n} n|\Omega_n(t + u) - \Omega_n(t)| = O_{\mathbb{P}}[(n b_n)^{1/2}]$$

and

(28)  $$\sup_{|u| \le b_n} n|\Sigma_n(t + u) - \Sigma_n(t)| = O_{\mathbb{P}}[(n b_n)^{1/2}].$$

Proof. For i.i.d. uniform(0, 1) random variables $U_i$, $i \in \mathbb{Z}$, let
$W_n(u) = \sum_{i=1}^n \mathbf{1}_{U_i \le u} - nu$. By Lemma 2.3 in Stute [20], there exists a constant $c_0$
such that

(29)  $$\mathbb{P}\Big\{\sup_{0 \le u \le b} |W_n(u + t) - W_n(t)| > s\sqrt{nb}\Big\} \le 4e^{-s^2/16}$$

holds for all $0 < b < 1/8$ and $32 \le s \le c_0\sqrt{nb}$. Since $(H_i)$ is 0/1-valued and
it is independent of $U_i$, it is easily seen that (29) implies

$$\mathbb{P}\Big\{\sup_{0 \le u \le b} |n\Omega_n(t + u) - n\Omega_n(t)| > s\sqrt{nb}\Big\} \le 4e^{-s^2/16}.$$

So we have (27) since $n b_n \to \infty$. A similar argument entails (28). □

Lemma 4. Assume Condition 1 and $\sup_{x \in [0,1]} g(x) < \infty$. Let $b_n \in (0, 1)$.
Then for $\Lambda_n^*(t)$ and $\Delta_n^*(t)$ defined in (21), we have

(30)  $$\sup_{|u| \le b_n} n|[\Lambda_n^*(t + u) - \Lambda(t + u)] - [\Lambda_n^*(t) - \Lambda(t)]| = O_{\mathbb{P}}(b_n n^{1/2})$$

and

(31)  $$\sup_{|u| \le b_n} n|[\Delta_n^*(t + u) - \Delta(t + u)] - [\Delta_n^*(t) - \Delta(t)]| = O_{\mathbb{P}}(b_n n^{1/2}).$$

Proof. Since $\Delta_n^*(t) = G(t) N_n/n$ and $\sup_{x \in [0,1]} g(x) < \infty$, by Condition
1, we have (31). Similarly (30) follows. □

4.1. Proof of Theorem 1. By the weak convergence theory (Billingsley
[4]), it suffices to establish (i) the finite-dimensional convergence and (ii) the
tightness.

We first show that the process $\sqrt{n}\{\Lambda_n(t) - \Lambda(t)\}$ is tight. Let
$L_n(t) = \Lambda_n^*(t) - \Lambda(t) = t(\pi_1 - N_n/n)$, $0 \le t \le 1$. By Condition 1, $\sqrt{n}(\pi_1 - N_n/n) = O_{\mathbb{P}}(1)$.
So $\sqrt{n} L_n(t)$ is trivially tight. Note that $\Lambda_n(t) - \Lambda(t) = \Omega_n(t) + L_n(t)$.
Following the tightness argument for the process $n^{-1/2} \sum_{i=1}^n (\mathbf{1}_{U_i \le u} - u)$,
$0 \le u \le 1$ (cf. Theorem 16.4 in Billingsley [4]), since $(H_i)$ and $(U_i)$ are independent,
we can easily derive that the process $\sqrt{n}\Omega_n(t)$ is also tight. Similarly,
we can show that $\sqrt{n}\{\Delta_n(t) - \Delta(t)\}$ is tight by noting that $\sup_{0 \le t \le 1} g(t) < \infty$.
So $(\sqrt{n}\{\Lambda_n(t) - \Lambda(t)\}, \sqrt{n}\{\Delta_n(t) - \Delta(t)\})$ is tight.

We now show the finite-dimensional convergence. Let $a$, $b$ be two real
numbers; let

(32)  $$T_n = \sum_{i=1}^n (J_i - \mathbb{E}J_i), \qquad \text{where } J_i = a(1 - H_i)\mathbf{1}_{U_i \le t} + bH_i\mathbf{1}_{U_i \le G(t)}.$$

We shall calculate the characteristic function $\varphi_n(\theta) = \mathbb{E}\{\exp[\theta\sqrt{-1}\, T_n/\sqrt{n}]\}$,
$\theta \in \mathbb{R}$. Let $A(\theta) = \log \mathbb{E}\{\exp(\sqrt{-1}\,\theta\mathbf{1}_{U_1 \le t})\}$ and
$B(\theta) = \log \mathbb{E}\{\exp(\sqrt{-1}\,\theta\mathbf{1}_{U_1 \le G(t)})\}$. Then for small $|\delta|$, we have

$$A(\delta) = \log(1 - t + te^{\sqrt{-1}\delta}) = t\delta\sqrt{-1} - \frac{\delta^2}{2} t(1 - t) + O(\delta^3),$$

$$B(\delta) = \log\{1 - G(t) + G(t)e^{\sqrt{-1}\delta}\} = G(t)\delta\sqrt{-1} - \frac{\delta^2}{2} G(t)[1 - G(t)] + O(\delta^3).$$

Let $v = G(t)\theta b - t\theta a$, $\varrho_0 = t(1 - t)\theta^2 a^2/2$ and $\varrho_1 = G(t)[1 - G(t)]\theta^2 b^2/2$.
With the preceding two relations, since $(N_n - n\pi_1)/\sqrt{n} \Rightarrow N(0, \sigma^2)$, as in the
argument for $\phi_n(t)$ in the proof of Lemma 2, we have

(33)  $$\begin{aligned}
\lim_{n \to \infty} \varphi_n(\theta) &= \lim_{n \to \infty} \frac{\mathbb{E}\exp\{(n - N_n)A(\theta a/\sqrt{n}) + N_n B(\theta b/\sqrt{n})\}}{\exp\{\sqrt{-1}\,\theta\sqrt{n}[a\pi_0 t + b\pi_1 G(t)]\}} \\
&= \lim_{n \to \infty} \mathbb{E}\exp\{(N_n - n\pi_1)\sqrt{-1}\, v/\sqrt{n} - (1 - N_n/n)\varrho_0 - (N_n/n)\varrho_1\} \\
&= \exp\{-v^2\sigma^2/2 - \pi_0\varrho_0 - \pi_1\varrho_1\}
\end{aligned}$$

after elementary manipulations. Hence $T_n/\sqrt{n}$ is asymptotically normal.
Consequently, by the Cramér-Wold device, the finite-dimensional convergence
follows.

4.2. Proof of Theorem 2. (i) Let $b_n$ be a real sequence with $b_n \in (0, 1)$
and $n b_n \to \infty$. Since $b_n \le \sqrt{b_n}$, by Lemmas 3 and 4, we have

(34)  $$\sup_{|u| \le b_n} n|[F_n(t + u) - F(t + u)] - [F_n(t) - F(t)]| = O_{\mathbb{P}}((n b_n)^{1/2}).$$

We first show that $\sqrt{n}(\nu_{\mathrm{BH}} - \nu_0) = O_{\mathbb{P}}(1)$. To this end, it suffices to show that
for any positive sequence $B_n \to \infty$, $\sqrt{n}(\nu_{\mathrm{BH}} - \nu_0) = O_{\mathbb{P}}(B_n)$. Without loss of
generality assume $B_n \le \log n$ since otherwise we can let $B_n' = \min(B_n, \log n)$.
Applying (34) with $b_n = B_n/\sqrt{n}$, since $F_n(\nu_0) - F(\nu_0) = O_{\mathbb{P}}(n^{-1/2})$, we have

(35)  $$\begin{aligned}
F_n(\nu_0 + b_n) &= F(\nu_0 + b_n) + [F_n(\nu_0) - F(\nu_0)] + O_{\mathbb{P}}((b_n/n)^{1/2}) \\
&= F(\nu_0) + b_n f(\nu_0) + O(b_n^2) + O_{\mathbb{P}}(n^{-1/2}).
\end{aligned}$$

Note that $t > \nu_{\mathrm{BH}}$ if and only if $t/\alpha > F_n(t)$. Since $1/\alpha > f(\nu_0)$, $F(\nu_0) = \nu_0/\alpha$
and $B_n \to \infty$, we have by (35) that

$$\mathbb{P}(\nu_0 + b_n > \nu_{\mathrm{BH}}) = \mathbb{P}[(\nu_0 + b_n)/\alpha > F_n(\nu_0 + b_n)] \to 1$$

as $n \to \infty$. Similarly, $\mathbb{P}(\nu_0 - b_n < \nu_{\mathrm{BH}}) \to 1$. So $\sqrt{n}(\nu_{\mathrm{BH}} - \nu_0) = O_{\mathbb{P}}(1)$, which,
by another application of (34) with $b_n = C/\sqrt{n}$, implies

(36)  $$n|[F_n(\nu_{\mathrm{BH}}) - F(\nu_{\mathrm{BH}})] - [F_n(\nu_0) - F(\nu_0)]| = O_{\mathbb{P}}(n^{1/4}).$$

Since $|F_n(\nu_{\mathrm{BH}}) - \nu_{\mathrm{BH}}/\alpha| = O(n^{-1})$ and
$F(\nu_{\mathrm{BH}}) = F(\nu_0) + (\nu_{\mathrm{BH}} - \nu_0)f(\nu_0) + O_{\mathbb{P}}(n^{-1})$, (11) follows.

The CLT $\sqrt{n}(\nu_{\mathrm{BH}} - \nu_0) \Rightarrow N(0, \sigma^2)$ easily follows from (11) in view of the
argument of (32) and (33) in the proof of Theorem 1: let $a = b = 1$ in (32),
then $J_i = \mathbf{1}_{X_i \le t}$.

(ii) As in (i) we shall show that for any positive sequence $B_n \to \infty$,
$R = O_{\mathbb{P}}(B_n)$. To this end, let $b_n = B_n/n$, $t_n = n[b_n - F(\alpha b_n)]/\sqrt{n b_n}$ and

$$Z_n = \frac{n[F_n(\alpha b_n) - F(\alpha b_n)]}{\sqrt{n b_n}}.$$

Since $b_n \to 0$ and $n b_n = B_n \to \infty$, by Taylor's expansion, $F(\alpha b_n) = f(0)\alpha b_n + o(b_n)$.
Hence $t_n/\sqrt{n b_n} \to 1 - \alpha/\alpha_* > 0$. So $t_n \to \infty$. By Lemma 2,
$Z_n \Rightarrow N[0, \alpha f(0)]$. Therefore

$$\mathbb{P}(R < B_n) = \mathbb{P}[F_n(\alpha b_n) < b_n] = \mathbb{P}(t_n > Z_n) \to 1.$$

(iii) Let $z > 0$ be fixed and $b_n = n^{-1/3} z$. By Taylor's expansion,
$F(b_n) = b_n f(0) + b_n^2 f'(0)/2 + o(b_n^2)$. Hence
$u_n := \sqrt{n/b_n}\,[b_n/\alpha - F(b_n)] \to -f'(0) z^{3/2}/2$. By Lemma 2,

$$\begin{aligned}
\mathbb{P}(\nu_{\mathrm{BH}} \le b_n) &= \mathbb{P}[F_n(b_n) \le b_n/\alpha] \\
&= \mathbb{P}\{\sqrt{n/b_n}\,[F_n(b_n) - F(b_n)] \le u_n\} \\
&\to \mathbb{P}(\mathcal{N} \le c_0 z^{3/2}) \\
&= \mathbb{P}\{[\max(\mathcal{N}/c_0, 0)]^{2/3} \le z\},
\end{aligned}$$

which proves (12).

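4.3. Proof of Theorem 3. (i) By Theorem 2(i), $\nu_{\mathrm{BH}} - \nu_0 = O_{\mathbb{P}}(1/\sqrt{n})$.
Similarly as in the proof of (36), by Lemmas 3 and 4, we have

(37)  $$\Lambda_n(\nu_{\mathrm{BH}}) = \Lambda_n(\nu_0) + \Lambda(\nu_{\mathrm{BH}}) - \Lambda(\nu_0) + O_{\mathbb{P}}(n^{-3/4}).$$

Recall $\Lambda(t) = t\pi_0$. Observe that $F(\nu_{\mathrm{BH}}) - F(\nu_0) = (\nu_{\mathrm{BH}} - \nu_0)f(\nu_0) + O_{\mathbb{P}}(1/n)$,
$F_n(\nu_0) = F(\nu_0) + O_{\mathbb{P}}(1/\sqrt{n})$ and, by (36), $F_n(\nu_{\mathrm{BH}}) = F(\nu_0) + O_{\mathbb{P}}(1/\sqrt{n})$. By
(36) and (37), we have

(38)  $$\Lambda_n(\nu_{\mathrm{BH}})F(\nu_0) - F_n(\nu_{\mathrm{BH}})\Lambda(\nu_0) = \frac{1}{n} \sum_{i=1}^n \{J_i - \mathbb{E}(J_i)\} + O_{\mathbb{P}}(n^{-3/4}),$$

where

$$J_i = F(\nu_0)(1 - H_i)\mathbf{1}_{X_i \le \nu_0} - \Lambda(\nu_0)\mathbf{1}_{X_i \le \nu_0} + \frac{F(\nu_0)\Lambda'(\nu_0) - f(\nu_0)\Lambda(\nu_0)}{\alpha^{-1} - f(\nu_0)}\,\mathbf{1}_{X_i \le \nu_0}.$$

Since $F(\nu_0) = \nu_0/\alpha$ and $\Lambda'(\nu_0) = \pi_0$, simple calculations show
$J_i = F(\nu_0)(1 - H_i)\mathbf{1}_{X_i \le \nu_0}$. Note that $F(\nu_0) \ge \nu_0$ and $F(\nu_0) = \pi_0\nu_0 + \pi_1 G(\nu_0)$. Then
$G(\nu_0) \ge \nu_0$. Using the property of conditional independence,

$$\mathbb{P}\Big(\min_{j \le n} X_j > \nu_0 \,\Big|\, H_1, \ldots, H_n\Big) = (1 - \nu_0)^{n - N_n}(1 - G(\nu_0))^{N_n} \le (1 - \nu_0)^n.$$

So $\mathbb{P}(\min_{j \le n} X_j > \nu_0) \le (1 - \nu_0)^n$ and hence (13) follows from (38) by noting
that $F_n(\nu_{\mathrm{BH}}) = F(\nu_0) + O_{\mathbb{P}}(1/\sqrt{n})$. The CLT $\sqrt{n}[\Gamma_n(\nu_{\mathrm{BH}}) - \alpha\pi_0] \Rightarrow N(0, \sigma_0^2)$
follows from (33).

(ii) The argument is similar to the one in (i). We have an analog of (37)
with $\Lambda_n(\cdot)$ therein replaced by $\bar{\Delta}_n(\cdot)$ and (14) similarly holds. The CLT also
follows from (33). Details are omitted.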






4.4. Proof of Theorem 4. For (16), the argument in the proof of Theorem
3 is applicable. Let $B_n$ be a positive sequence that diverges to infinity slower
than $\log n$; let $r_n = b_n + (n b_n)^{-1/2} \asymp n^{-1/3}$. By Lemmas 3 and 4,

(39)  $$\begin{aligned}
F_n(\nu_* + r_n B_n) &= F(\nu_* + r_n B_n) + [F_n(\nu_*) - F(\nu_*)] + O_{\mathbb{P}}[(r_n B_n/n)^{1/2}] \\
&= F(\nu_*) + f(\nu_*) r_n B_n + O(r_n^2 B_n^2) + O_{\mathbb{P}}(n^{-1/2}).
\end{aligned}$$

By Lemma 2, $\sqrt{n b_n}\,(\hat{\pi}_0 - \mathbb{E}\hat{\pi}_0) \Rightarrow N(0, f(1))$. Since $b_n \asymp n^{-1/3}$ and $B_n \to \infty$,
we have

(40)  $$\mathbb{P}\{(\nu_* + r_n B_n)(\hat{\pi}_0 - \mathbb{E}\hat{\pi}_0) > \nu_*(\pi_0 - \mathbb{E}\hat{\pi}_0) + r_n B_n[\alpha f(\nu_*) - \mathbb{E}\hat{\pi}_0]\} \to 1$$

since $f(\nu_*) < \pi_0/\alpha$ and $\mathbb{E}\hat{\pi}_0 = \pi_0 + O(b_n)$. Note that $F(\nu_*) = \pi_0\nu_*/\alpha$. By
(39) and (40),

$$\mathbb{P}[(\nu_* + r_n B_n)\hat{\pi}_0 > \alpha F_n(\nu_* + r_n B_n)] \to 1,$$

which implies that $\mathbb{P}(\nu_{\mathrm{PI}} < \nu_* + r_n B_n) \to 1$. Similarly, we have
$\mathbb{P}(\nu_{\mathrm{PI}} > \nu_* - r_n B_n) \to 1$ and hence $\nu_{\mathrm{PI}} - \nu_* = O_{\mathbb{P}}(r_n)$. By Lemmas 3 and 4,

(41)  $$F_n(\nu_{\mathrm{PI}}) = F(\nu_{\mathrm{PI}}) + [F_n(\nu_*) - F(\nu_*)] + O_{\mathbb{P}}[(r_n/n)^{1/2}].$$

Since $|F_n(\nu_{\mathrm{PI}}) - \nu_{\mathrm{PI}}\hat{\pi}_0/\alpha| \le n^{-1}$ and $F(\nu_{\mathrm{PI}}) - F(\nu_*) = (\nu_{\mathrm{PI}} - \nu_*)f(\nu_*) + O_{\mathbb{P}}(r_n^2)$,
(16) follows from (41) after elementary calculations.

Using the argument in the proof of Theorem 3, we can similarly obtain
(ii) with no essential difficulties. Since the calculation is lengthy, the details
are omitted.